Journal article
ChemTables: a dataset for semantic classification on tables in chemical patents
Z Zhai, C Druckenbrodt, C Thorne, SA Akhondi, DQ Nguyen, T Cohn, K Verspoor
Journal of Cheminformatics | BMC | Published: 2021
Abstract
Chemical patents are a commonly used channel for disclosing novel compounds and reactions, and hence represent important resources for chemical and pharmaceutical research. Key chemical data in patents is often presented in tables. Both the number and the size of tables can be very large in patent documents. In addition, various types of information can be presented in tables in patents, including spectroscopic and physical data, or pharmacological use and effects of chemicals. Since images of Markush structures and merged cells are commonly used in these tables, their structure also shows substantial variation. This heterogeneity in content and structure of tables in chemical patents makes ..
Grants
Awarded by Australian Research Council
Funding Acknowledgements
Funding for the ChEMU project is provided by an Australian Research Council Linkage Project, project number LP160101469, and Elsevier. ZZ receives support from the University of Melbourne via a Melbourne Research Scholarship. We acknowledge the work of Panagiotis Eustratiadis who contributed to the preparation of the ChemTables dataset while employed at Elsevier.